This document summarises the simulation results presented in the manuscript *Non-parametric efficient estimation of marginal structural models with multi-valued time-varying treatments*, including all tables and figures.
The approximate true values are as follows:
U1 value: 428.0682954, 3208.262406
U2 value: IPW: 2.5774786, -0.2170228; Nelder-Mead optimization: 2.5571779, -0.214558; CG optimization: 2.5775728, -0.2167315
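The IPW column comes from a weighted regression fit, while the Nelder-Mead and CG columns minimise the same weighted criterion numerically. A minimal sketch of that comparison, using simulated placeholder data (`a`, `y`, `w` are illustrative, not the manuscript's variables):

```r
# Hypothetical sketch: fitting a linear MSM E[Y^a] = beta0 + beta1 * a
# from weighted data. The data below are simulated placeholders.
set.seed(1)
n <- 500
a <- sample(1:4, n, replace = TRUE)
y <- 2.5 - 0.2 * a + rnorm(n)
w <- runif(n, 0.5, 2)  # stands in for estimated IP weights

# IPW estimate: weighted least squares in closed form
fit_ipw <- lm(y ~ a, weights = w)

# The same weighted criterion minimised numerically with optim()
wls <- function(beta) sum(w * (y - beta[1] - beta[2] * a)^2)
fit_nm <- optim(c(0, 0), wls, method = "Nelder-Mead")
fit_cg <- optim(c(0, 0), wls, method = "CG")

rbind(IPW = coef(fit_ipw), NelderMead = fit_nm$par, CG = fit_cg$par)
```

All three rows should agree closely, mirroring the small differences between the IPW, Nelder-Mead, and CG values reported above.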
I also generated the numerator of the stabilized weights using a multinomial model (with only prior treatment history as covariates); the results are similar:
U1 value: 0.6511902, 5.4334605
U2 value: IPW: 2.56737, -0.2157752; Nelder-Mead optimization: 2.5476633, -0.2134972; CG optimization: 2.5776594, -0.2165482
However, the stabilized weights are much more stable than the non-stabilized ones (non-stabilized summary first, stabilized summary second):
## true_p1 true_p2 true_p3 true_p4
## Min. : 3.558 Min. : 9.083 Min. : 27.55 Min. : 74.83
## 1st Qu.: 4.280 1st Qu.: 17.339 1st Qu.: 70.59 1st Qu.: 296.69
## Median : 4.511 Median : 21.132 Median : 93.63 Median : 425.18
## Mean : 4.999 Mean : 25.011 Mean : 125.03 Mean : 625.67
## 3rd Qu.: 5.327 3rd Qu.: 27.861 3rd Qu.: 143.18 3rd Qu.: 698.08
## Max. :10.687 Max. :196.530 Max. :3613.96 Max. :66456.64
## true_p1 true_p2 true_p3 true_p4
## Min. :0.7799 Min. :0.3678 Min. :0.1625 Min. : 0.06445
## 1st Qu.:0.9222 1st Qu.:0.7706 1st Qu.:0.6910 1st Qu.: 0.65517
## Median :0.9976 Median :0.9544 Median :0.9000 Median : 0.86337
## Mean :1.0002 Mean :1.0005 Mean :1.0002 Mean : 1.00011
## 3rd Qu.:1.0488 3rd Qu.:1.1106 3rd Qu.:1.1259 3rd Qu.: 1.13853
## Max. :1.3373 Max. :3.1629 Max. :7.0144 Max. :15.21569
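The construction of the stabilized weights can be sketched as follows: the numerator is a multinomial model for treatment given prior treatment history only, the denominator also conditions on covariates, and the weight is their ratio. The variable names (`a1`, `a2`, `l`) and data are illustrative, not the manuscript's:

```r
# Hedged sketch of stabilized weights for a multi-valued treatment.
# nnet::multinom fits the multinomial models; data are simulated placeholders.
library(nnet)
set.seed(2)
n  <- 1000
l  <- rnorm(n)                                        # time-varying covariate
a1 <- factor(sample(1:4, n, replace = TRUE))          # prior treatment
a2 <- factor(sample(1:4, n, replace = TRUE,
                    prob = c(.4, .3, .2, .1)))        # current treatment

num_fit <- multinom(a2 ~ a1, trace = FALSE)           # numerator: prior treatment only
den_fit <- multinom(a2 ~ a1 + l, trace = FALSE)       # denominator: adds covariates

# Probability of the treatment actually received, under each model
pick <- cbind(seq_len(n), as.integer(a2))
num  <- predict(num_fit, type = "probs")[pick]
den  <- predict(den_fit, type = "probs")[pick]
sw   <- num / den                                     # stabilized weight
summary(sw)
```

As in the summaries above, the stabilized weights concentrate around 1, whereas the raw inverse-probability weights can take very large values.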
Of note, I also recorded the bias that would be expected (by estimating the treatment probabilities the same way we do in the simulations); this leads to the following values: 1.1510768, -0.0609292.
This section is split into two parts because the initial choice of learners led to a very slow run time, and it seemed I would not get results in a reasonable amount of time. I therefore switched to a more basic set of learners to get results faster. The first part reflects this limited set of learners; the second part reflects the extended set.
Each sample size was run for 200 iterations.
stackr = list(
  "mean", "lightgbm", "multinom", "xgboost", "nnet",
  "knn", "rpart", "naivebayes", "glmnet",
  list("randomforest", ntree = 250, id = "randomforest"),
  list("ranger", num.trees = 250, id = "ranger")
)
stackm = c("SL.mean", "SL.glmnet", "SL.earth", "SL.glm.interaction")
Takeaways: results are overall good for SDR, but less so for TMLE. The weights are now very well estimated once the sample size is sufficient. The outcome model probably needs some improvement to obtain consistent results in scenario 2.
Takeaways: results are overall good for SDR, but less so for TMLE. Improving the outcome model might help in some scenarios.